Affective computing, especially from speech, is one of the key steps toward building more natural and effective human-machine interaction. In recent years, several emotional speech corpora in different languages have been collected; however, Turkish is not among the languages that have been investigated in the context of emotion recognition. To address this gap, a new Turkish emotional speech database, comprising 5,100 utterances extracted from 55 Turkish movies, was constructed. Each utterance in the database is labeled with emotion categories (happy, surprised, sad, angry, fearful, neutral, and others) and with the three-dimensional emotional space (valence, activation, and dominance). We performed classification of four basic emotion classes (neutral, sad, happy, and angry) and estimation of emotion primitives using acoustic features. The importance of acoustic features in estimating the emotion primitive values and in classifying emotions into categories was also investigated. An unweighted average recall of 45.5% was obtained for the classification. For emotion dimension estimation, we obtained promising results for the activation and dominance dimensions. For valence, however, the correlation between the averaged ratings of the evaluators and the estimates was low. Cross-corpus training and testing also showed good results for the activation and dominance dimensions.
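The abstract reports results as unweighted average recall (UAR), the standard metric for class-imbalanced emotion corpora: it averages per-class recall so that each emotion class counts equally, regardless of how many utterances it has. A minimal sketch of the metric (function name and toy labels are our own, not from the paper):

```python
from collections import defaultdict

def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls; each class contributes equally
    regardless of how many samples it has."""
    correct = defaultdict(int)  # per-class hits
    total = defaultdict(int)    # per-class sample counts
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)

# Toy example over the four basic classes used in the paper
y_true = ["neutral", "neutral", "sad", "happy", "angry", "angry"]
y_pred = ["neutral", "sad",     "sad", "happy", "angry", "happy"]
print(unweighted_average_recall(y_true, y_pred))  # 0.75
```

Here neutral and angry each have recall 0.5 while sad and happy have recall 1.0, giving a UAR of 0.75; plain accuracy would weight the larger classes more heavily.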